-
Notifications
You must be signed in to change notification settings - Fork 93
feat: handle parameterConsistency option in YAML extensions #624
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
feat: handle parameterConsistency option in YAML extensions #624
Conversation
vbarua
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some comments. The big question I have is actually around what INCONSISTENT is supposed to mean. It's not super clear to me based on what's currently in the specification, and I'm not sure if the checks you've added in the FunctionConverter are correct. We can follow up with the upstream to clarify the intended behaviour for the Isthmus code.
The code you've added to the core to parse the parameterConsistency looks good to me, and stops substrait-java from blowing up if it encounters an extension with parameter consistency set as Ben pointed out in #622. If you make the core parsing changes their own PR, I'd be happy to merge those in.
isthmus/src/test/java/io/substrait/isthmus/expression/VariadicParameterConsistencyTest.java
Outdated
Show resolved
Hide resolved
core/src/test/java/io/substrait/extension/TypeExtensionTest.java
Outdated
Show resolved
Hide resolved
| // have the same type. | ||
| return wildcardToType.values().stream().allMatch(s -> s.size() == 1); | ||
| // When parameterConsistency is INCONSISTENT, wildcard types can differ. | ||
| // When parameterConsistency is CONSISTENT (or not variadic), wildcard types must be the same. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As I understand it, the parameterConsistency behaviour only applies to the variadic arguments. It's not related to how we typecheck the enumerated wildcard types (i.e any1, any2, etc). From the (limited) documentation on it:
When the last argument of a function is variadic and declares a type parameter e.g. fn(A, B, C...), the C parameter can be marked as either consistent or inconsistent. If marked as consistent, the function can only be bound to arguments where all the C types are the same concrete type. If marked as inconsistent, each unique C can be bound to a different type within the constraints of what T allows.
CONSISTENT means that the types of all the variadic arguments have to be the same. INCONSISTENT means... I'm not sure tbh because in
each unique C can be bound to a different type within the constraints of what T allows.
I don't know what T is. This is something I can follow up with the community about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I realize that this is attempting to handle the second part of the ticket I linked (#622).
However, I don't believe this is the right place to handle it. Considering that parameterConsistency has meaning regardless of calcite, those checks should really be inside of core/ IMO.
|
|
||
| @Value.Default | ||
| default ParameterConsistency parameterConsistency() { | ||
| return ParameterConsistency.CONSISTENT; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a reasonble default, because I think it's what most people expect in practice and it's also the first value in the enumeration.
We can formalize this in the spec more concretely.
benbellick
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this. Can you make sure to link #622 in your PR description? Also, do you plan on implementing the second piece of that ticket? Which is validating that variadic rguments to functions are consistent with the parameterConsistency argument. Let me know if you need me to clarify anything.
| // have the same type. | ||
| return wildcardToType.values().stream().allMatch(s -> s.size() == 1); | ||
| // When parameterConsistency is INCONSISTENT, wildcard types can differ. | ||
| // When parameterConsistency is CONSISTENT (or not variadic), wildcard types must be the same. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I realize that this is attempting to handle the second part of the ticket I linked (#622).
However, I don't believe this is the right place to handle it. Considering that parameterConsistency has meaning regardless of calcite, those checks should really be inside of core/ IMO.
benbellick
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A lot of my comments are about relatively minor things. But the most important point is about the meaning of INCONSISTENT. I do think that the upstream could benefit from a bit more clarity there. Thanks for your work and let me know if anything I have said is unclear!
| SimpleExtension.VariadicBehavior variadicBehavior = variadic.get(); | ||
| if (variadicBehavior.parameterConsistency() | ||
| != SimpleExtension.VariadicBehavior.ParameterConsistency.CONSISTENT) { | ||
| // INCONSISTENT allows different types, so validation passes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So unfortunately, this is going to be a bit more complicated than it initially seemed. From the docs:
// Each argument can be any possible concrete type afforded by the bounds // of any parameter defined in the arguments specification.
Let me show you a (made-up) example:
urn: "urn:example:extension"
scalar_functions:
- name: "inconsistent_sum"
impls:
- args:
- value: "decimal<P,S>"
variadic:
min: 1
parameterConsistency: INCONSISTENT
return: "decimal<38,S>"
This means that the following are valid:
inconsistent_sum(decimal<10, 2>, decimal<10, 2>, decimal<10, 2>)inconsistent_sum(decimal<10, 2>, decimal<10, 2>)inconsistent_sum(decimal<15, 8>, decimal<11, 8>, decimal<119, 8>)
On the other hand, the following are all invalid:
inconsistent_sum(decimal<10, 2>, decimal<10, 3>, decimal<10, 4>)inconsistent_sum(decimal<10, 2>, i32)
The above example is showing that there is the implicit constraint across the variadic parameters that the scale S must be the same as the output (and thus must all be the same across the variadic parameter). On the other hand, the precision P is free to be anything. But all of the parameters do have to be decimal.
Not sure if that was instructive or not 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For readability, it may make sense to have two private methods that implement each behavior, then have the public method just be a switch on the parameter consistency options. Just an idea!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll open a different issue for this and a follow up PR, and we can add a big TODO comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might as well put the ticket number in the comment as well.
core/src/main/java/io/substrait/expression/VariadicParameterConsistencyValidator.java
Show resolved
Hide resolved
core/src/main/java/io/substrait/expression/VariadicParameterConsistencyValidator.java
Outdated
Show resolved
Hide resolved
| .options(java.util.Collections.emptyMap()) | ||
| .build(); | ||
|
|
||
| return Expression.ScalarFunctionInvocation.builder() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry unresolving this one, did you maybe miss it?
| } | ||
|
|
||
| @Test | ||
| void testConsistentVariadicWithSameTypes() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Basically every place here where you are using the builders directly, I think it would be better to use the ExpressionCreator helpers. E.g.
Expression.I64Literal.builder().value(1).build() -> ExpressionCreator.i8(false, 1)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also unresolving this one, did you maybe miss it?
|
|
||
| assertEquals( | ||
| SimpleExtension.VariadicBehavior.ParameterConsistency.CONSISTENT, | ||
| collection.scalarFunctions().get(0).variadic().get().parameterConsistency()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets be super thorough and also add a test to make sure that parameterConsistency: INCONSISTENT is also passed through.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also unresolving this one, think its also missed.
core/src/test/java/io/substrait/expression/VariadicParameterConsistencyTest.java
Show resolved
Hide resolved
core/src/test/java/io/substrait/expression/VariadicParameterConsistencyTest.java
Outdated
Show resolved
Hide resolved
benbellick
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left a few more comments. Thanks!
| * that when parameterConsistency is CONSISTENT, all variadic arguments have the same type (ignoring | ||
| * nullability). | ||
| */ | ||
| public class VariadicParameterConsistencyValidator { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| public class VariadicParameterConsistencyValidator { | |
| class VariadicParameterConsistencyValidator { |
| SimpleExtension.VariadicBehavior variadicBehavior = variadic.get(); | ||
| if (variadicBehavior.parameterConsistency() | ||
| != SimpleExtension.VariadicBehavior.ParameterConsistency.CONSISTENT) { | ||
| // INCONSISTENT allows different types, so validation passes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You might as well put the ticket number in the comment as well.
| /** | ||
| * Validates that variadic arguments satisfy the parameter consistency requirement. When | ||
| * CONSISTENT, all variadic arguments must have the same type (ignoring nullability). When | ||
| * INCONSISTENT, arguments can have different types. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets clarify this to say something other than "have different types". Maybe, "have any type satisfying the type constraints" or something?
| return (Type) arg; | ||
| } | ||
| }) | ||
| .collect(Collectors.toList()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if all of this filtering logic could be simplified if we just introduced an EnumArg type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think its fine to leave this for now, but just making note of it.
| // Count how many Expression/Type arguments are in the fixed arguments (before variadic) | ||
| // Note: func.args() includes all argument types (Expression, Type, EnumArg), but we only | ||
| // care about Expression/Type arguments for type consistency checking | ||
| int fixedTypeArgCount = 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a better name may be something like nonVariadicArgCount? It would be nice to if you had an example e.g. sum(i32...) has a nonVariadicArgCount of 0. Or is it 1?
The reason why I am suggested an example is just that it isn't entirely clear to me whether or not the first non-variadic example is included in this list or not without closely reading the code.
| int fixedTypeArgCount = 0; | ||
| for (int i = 0; i < func.args().size() && i < arguments.size(); i++) { | ||
| FunctionArg arg = arguments.get(i); | ||
| if (arg instanceof Expression || arg instanceof Type) { | ||
| fixedTypeArgCount++; | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Take it or leave it, but since you are already doing it functional style already above. But yeah truly feel free to ignore if these seems more confusing.
| int fixedTypeArgCount = 0; | |
| for (int i = 0; i < func.args().size() && i < arguments.size(); i++) { | |
| FunctionArg arg = arguments.get(i); | |
| if (arg instanceof Expression || arg instanceof Type) { | |
| fixedTypeArgCount++; | |
| } | |
| } | |
| int fixedTypeArgCount = | |
| (int) | |
| arguments.stream() | |
| .limit(func.args().size()) | |
| .filter(arg -> arg instanceof Expression || arg instanceof Type) | |
| .count(); | |
| .options(java.util.Collections.emptyMap()) | ||
| .build(); | ||
|
|
||
| return Expression.ScalarFunctionInvocation.builder() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry unresolving this one, did you maybe miss it?
| } | ||
|
|
||
| @Test | ||
| void testConsistentVariadicWithSameTypes() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also unresolving this one, did you maybe miss it?
|
|
||
| assertEquals( | ||
| SimpleExtension.VariadicBehavior.ParameterConsistency.CONSISTENT, | ||
| collection.scalarFunctions().get(0).variadic().get().parameterConsistency()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also unresolving this one, think its also missed.
This pull request introduces improved handling and validation of variadic parameter consistency for function signatures, particularly distinguishing between consistent and inconsistent parameter types in variadic arguments. The changes ensure that function argument matching logic respects the
parameterConsistencysetting and are thoroughly tested with new unit tests.Enhancements to variadic parameter consistency:
FunctionConverterso that, whenparameterConsistencyis set toCONSISTENT, all variadic arguments must have the same type (ignoring nullability), and when set toINCONSISTENT, variadic arguments can differ in type. This logic is now explicitly enforced in theinputTypesMatchDefinedArgumentsmethod. [1] [2] [3]parameterConsistencyin theSimpleExtension.VariadicBehaviorinterface, defaulting toCONSISTENTif not specified.FIxes: #622